N=1:
rank | frequency | n-gram |
---|---|---|
1 | 24240 | -n |
2 | 23083 | -a |
3 | 13631 | -i |
4 | 7582 | -s |
5 | 7186 | -r |

N=2:
rank | frequency | n-gram |
---|---|---|
1 | 17298 | -an |
2 | 10100 | -ya |
3 | 5488 | -ng |
4 | 3105 | -si |
5 | 2933 | -ah |

N=3:
rank | frequency | n-gram |
---|---|---|
1 | 9314 | -nya |
2 | 5340 | -kan |
3 | 2081 | -ang |
4 | 1746 | -asi |
5 | 1588 | -ing |

N=4:
rank | frequency | n-gram |
---|---|---|
1 | 2601 | -nnya |
2 | 1286 | -ngan |
3 | 1123 | -inya |
4 | 1040 | -anya |
5 | 1017 | -akan |

N=5:
rank | frequency | n-gram |
---|---|---|
1 | 2317 | -annya |
2 | 669 | -angan |
3 | 428 | -sinya |
4 | 418 | -ngnya |
5 | 382 | -ngkan |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The computation runs parallel to the one in 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos := (@pos + 1), xx.*
FROM (SELECT @pos := 0) r,
     (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3))
      FROM words
      WHERE w_id > 100
      GROUP BY RIGHT(word, 3)
      ORDER BY cnt DESC) xx
LIMIT 5;
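The same ranking can be sketched outside the database for any N. The Python snippet below is only an illustration, not the code used to produce the tables: the function name `top_word_endings` and the toy word list are hypothetical, and it assumes the input is a list of word types, each counted once as `COUNT(*)` does over the `words` table.

```python
from collections import Counter


def top_word_endings(words, n, k=5):
    """Return the k most frequent n-letter word endings as (rank, count, ngram)."""
    # Count each distinct word once, mirroring COUNT(*) over a table of word types.
    # Assumption: words shorter than n are skipped. MySQL's RIGHT(word, n) would
    # return the whole word instead, so results may differ for short words.
    counts = Counter(w[-n:] for w in set(words) if len(w) >= n)
    # Sort by descending count; ties are broken alphabetically so the output is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, cnt, "-" + ending)
        for rank, (ending, cnt) in enumerate(ranked[:k], start=1)
    ]


# Hypothetical toy word list, not taken from the corpus.
sample = ["makanan", "minuman", "bukunya", "rumahnya", "jalan", "berikan", "makan"]

for n in range(1, 6):
    print(n, top_word_endings(sample, n, k=2))
```

Looping over `n` from 1 to 5 gives one ranked list per table. The alphabetical tie-break is a choice made for this sketch; the SQL query leaves the order of equal counts unspecified.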